Weighted Local Least Squares Imputation Method for Missing Value Estimation
نویسندگان
چکیده
Missing values often exist in the data of gene expression microarray experiments. A number of methods such as the Row Average (RA) method, KNNimpute algorithm and SVDimpute algorithm have been proposed to estimate the missing values. Recently, Kim et al. proposed a Local Least Squares Imputation (LLSI) method for estimating the missing values. In this paper, we propose a Weighted Local Least Square Imputation (WLLSI) method for missing values estimation. WLLSI allows training on the weighting and therefore can take advantage of both the LLSI method and the RA method. Numerical results on both synthetic data and real microarray data are given to demonstrate the effectiveness of our proposed method. The imputation methods are then applied to a breast cancer dataset.
منابع مشابه
Missing value estimation for DNA microarray gene expression data: local least squares imputation
MOTIVATION Gene expression data often contain missing expression values. Effective missing value estimation methods are needed since many algorithms for gene expression data analysis require a complete matrix of gene array values. In this paper, imputation methods based on the least squares formulation are proposed to estimate missing values in the gene expression data, which exploit local simi...
متن کاملMissing Value Estimation for DNA Microarray Expression Data: Least Squares Imputation
Motivation: Gene expression microarray data sets often contain missing expression values. Robust missing value estimation methods are needed since many algorithms for gene expression analysis require a complete matrix of gene array values. In this paper, imputation methods based on the least squares and cluster structure are proposed to estimate missing values in the gene expression data, which...
متن کاملCollateral Missing Value Estimation: Robust Missing Value Estimation for Consequent Microarray Data Processing
Microarrays have unique ability to probe thousands of genes at a time that makes it a useful tool for variety of applications, ranging from diagnosis to drug discovery. However, data generated by microarrays often contains multiple missing gene expressions that affect the subsequent analysis, as most of the times these missing values are ignored. In this paper we have analyzed how accurate esti...
متن کاملM Effect of Missing Value Methods on Bayesian Network Classification of Hepatitis Data
Missing value imputation methods are widely used in solving missing value problems during statistical analysis. For classification tasks, these imputation methods can affect the accuracy of the Bayesian network classifiers. This paper study’s the effect of missing value treatment on the prediction accuracy of four Bayesian network classifiers used to predict death in acute chronic Hepatitis pat...
متن کاملEvaluation of Missing Value Estimation for Microarray Data
Microarray gene expression data contains missing values (MVs). However, some methods for downstream analyses, including some prediction tools, require a complete expression data matrix. Current methods for estimating the MVs include sample mean and K-nearest neighbors (KNN). Whether the accuracy of estimation (imputation) methods depends on the actual gene expression has not been thoroughly inv...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007